<html>
<head>
<title>15-740 Project Proposal: Virtual Application Profiler</title>
</head>
<body>

<center>
<h1>15-740 Project Proposal: Virtual Application Profiler</h1>
Sven Stork, Anthony Gitter
</center>

<h3>Project Description</h3>
Existing binary instrumentation tools can be useful for analyzing code performance and optimal hardware
structures for an application, but the slowdown from dynamic instrumentation can make the use of such
tools a tedious process. In addition, any programmer wishing to create custom dynamic analysis must first
learn to use an instrumentation library such as <a href="http://www.pintool.org/">Pin</a>.
<p>
We propose to implement an extensible framework for rapid prototyping and analysis of program behavior
on hardware structures. Our framework will first use Pin to instrument code and log a trace of a program's
execution, including information such as memory accesses, memory allocation, function calls, time, and symbol
names. The trace will be stored in a <a href="http://www.sqlite.org/">SQLite</a> database. Based on this stored
information, the programmer
can quickly and easily perform different kinds of analysis without re-executing the Pin-instrumented code.
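<p>
As a rough illustration of the trace store described above, the following Python sketch creates a SQLite table for memory accesses and logs a few events. The table name <code>mem_access</code>, its columns, and the helper functions are hypothetical; the actual schema will be fixed during the design phase, and the real logger would be a Pin tool rather than a Python script.
<pre>
```python
import sqlite3

# Hypothetical schema: table and column names are illustrative only,
# not the final design of the proposed framework.
SCHEMA = """
CREATE TABLE mem_access (
    seq      INTEGER PRIMARY KEY,  -- position of the event in the trace
    addr     INTEGER NOT NULL,     -- accessed memory address
    size     INTEGER NOT NULL,     -- access size in bytes
    is_write INTEGER NOT NULL,     -- 1 for a store, 0 for a load
    func     TEXT                  -- enclosing function symbol, if known
);
"""

def create_trace_db(path=":memory:"):
    """Open a SQLite database and create the (assumed) trace table."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn

def log_accesses(conn, events):
    """Append (addr, size, is_write, func) tuples to the trace in order."""
    conn.executemany(
        "INSERT INTO mem_access (addr, size, is_write, func) VALUES (?, ?, ?, ?)",
        events,
    )
    conn.commit()

# Stand-in for events that an instrumented run would emit.
conn = create_trace_db()
log_accesses(conn, [
    (0x1000, 4, 0, "main"),
    (0x1004, 4, 1, "main"),
    (0x2000, 8, 0, "helper"),
])
row_count = conn.execute("SELECT COUNT(*) FROM mem_access").fetchone()[0]
```
</pre>
Because <code>seq</code> is an integer primary key, SQLite assigns it in insertion order, which preserves the order of events for later replay.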
<p>
One of the main benefits of this approach is that all intermediate information is directly accessible via
SQL queries, which enables programmers to build custom types of high level analysis with ease. However,
analysis is not limited to what can be expressed as SQL queries, as we will use the database to retrieve
information for more complex use cases. Equally important, the instrumented code needs to be run only
once; without a stored trace, iteratively tweaking parameters and rerunning instrumented code can be a
very slow process.
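<p>
To make the idea of query-based analysis concrete, here is a minimal Python sketch that counts accesses and writes per function with a single SQL query. The <code>mem_access</code> table, its columns, and the sample rows are assumptions made for this example, not the framework's actual schema.
<pre>
```python
import sqlite3

# Small stand-in trace; in the proposed framework these rows would come
# from the Pin tool. Table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mem_access (addr INTEGER, is_write INTEGER, func TEXT)")
conn.executemany("INSERT INTO mem_access VALUES (?, ?, ?)", [
    (0x1000, 0, "main"),
    (0x1004, 1, "main"),
    (0x2000, 0, "helper"),
    (0x2008, 0, "helper"),
    (0x2010, 1, "helper"),
])

# Per-function access and write counts, most active function first.
PER_FUNCTION = """
SELECT func, COUNT(*) AS accesses, SUM(is_write) AS writes
FROM mem_access
GROUP BY func
ORDER BY accesses DESC
"""
per_function = conn.execute(PER_FUNCTION).fetchall()
# per_function == [("helper", 3, 1), ("main", 2, 1)]
```
</pre>
Changing the analysis, for example grouping by address range instead of function, only requires editing the query; the logged trace is reused unchanged.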


<h3>Goals</h3>
Our development plan is iterative by nature: we will first provide simple, core functionality and then
incrementally add richer features.

<h4>75% goal</h4>
At a minimum, we will implement the basic data-collection functionality as a Pin tool and store the gathered
information in a SQLite database. We will provide basic memory profiling information and replay features.

<h4>100% goal</h4>
In addition to the functionality described in the 75% goal, we will provide more sophisticated performance
analysis, demonstrate that such analysis can be built upon SQL queries, and compare the performance of our
approach with that of a pure Pin-based tool such as a cache simulator.
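<p>
As one possible shape for the trace-driven counterpart of a cache simulator, the Python sketch below replays logged addresses through a direct-mapped cache. The <code>mem_access</code> table, the function name, and the cache parameters are illustrative assumptions; a real comparison would use a realistic cache model and a full trace.
<pre>
```python
import sqlite3

def simulate_direct_mapped(conn, num_lines=4, line_size=64):
    """Replay logged addresses through a direct-mapped cache.

    Returns (hits, misses). The mem_access table is a hypothetical schema.
    """
    tags = [None] * num_lines          # block currently held by each cache line
    hits = misses = 0
    for (addr,) in conn.execute("SELECT addr FROM mem_access ORDER BY seq"):
        block = addr // line_size      # memory block containing this address
        index = block % num_lines      # cache line the block maps to
        if tags[index] == block:
            hits += 1
        else:
            misses += 1
            tags[index] = block        # evict whatever was there before
    return hits, misses

# Stand-in trace: five accesses, two of which conflict in a 4-line cache.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mem_access (seq INTEGER PRIMARY KEY, addr INTEGER)")
conn.executemany("INSERT INTO mem_access (addr) VALUES (?)",
                 [(0,), (8,), (64,), (256,), (0,)])

small_cache = simulate_direct_mapped(conn, num_lines=4)   # (1, 4)
large_cache = simulate_direct_mapped(conn, num_lines=8)   # (2, 3)
```
</pre>
Both cache configurations are evaluated against the same stored trace, so varying <code>num_lines</code> or <code>line_size</code> does not require rerunning the instrumented program.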

<h4>125%+ goal</h4>
If the project proceeds much more quickly than planned, we will continue to extend the richness of the trace
until the replay is essentially indistinguishable from the original program execution. At this point we could
provide very advanced analysis such as lock set verification, false sharing statistics, and the effects of huge
page tables.


<h3>Schedule</h3>
We propose the following schedule:
<ul>
	<li><b>Week 1</b>  Review related literature. Set up repositories and development environment.</li>
	<li><b>Week 2</b>  First design phase of the software architecture and database schema.</li>
	<li><b>Week 3</b>  Initial implementation of the Pin tool.</li>
	<li><b>Week 4</b>  Completion of the first prototype. Begin work on the data analysis.</li>
	<li><b>Week 5</b>  Either iterate and implement more advanced logging and analysis or fix shortcomings in the current
design/prototype.</li>
	<li><b>Week 6</b>  Evaluation of our tool and writeup. Create the poster.</li>
</ul>

Once we determine what types of information will be logged, the Pin tool implementation can proceed
in parallel with the SQLite database design and the development of the analysis features. This will allow us
to divide much of the work for weeks 2 through 5. The completion of the Pin tool is on the critical path for
the majority of the project because all downstream analysis depends on it.


<h3>Milestone</h3>
We expect to be able to achieve our previously described 75% goal by November 17.


<h3>Getting Started</h3>
Our project was designed so that we can begin work quickly. We require no special hardware or external
resources. Furthermore, both Pin and SQLite are freely available online, and we already have limited
experience with Pin from the first homework assignment.
<p>
There are a number of existing tools for memory profiling, the analysis of hardware effects on program
performance, and examination of program behavior. Some, such as MemSpy [2], are entirely
instrumentation-based tools, while others, like SIGMA [1], log a trace for subsequent analysis in a manner
similar to our proposed framework. We have only begun to explore the set of related work, and a literature
search will be the first major step of our project.


<h3>References</h3>
<a href="http://portal.acm.org/citation.cfm?id=762761.762783">[1]</a>
L. DeRose, K. Ekanadham, J.K. Hollingsworth, and S. Sbaraglia. SIGMA: A simulator infrastructure
to guide memory analysis. In <i>Proceedings of the 2002 ACM/IEEE conference on Supercomputing</i>, pages
1-13. IEEE Computer Society Press Los Alamitos, CA, USA, 2002.
<br>
<a href="http://portal.acm.org/citation.cfm?id=133057.133079">[2]</a>
M. Martonosi, A. Gupta, and T. Anderson. MemSpy: analyzing memory system bottlenecks in programs.
In <i>Proceedings of the 1992 ACM SIGMETRICS joint international conference on Measurement and
modeling of computer systems</i>, pages 1-12. ACM New York, NY, USA, 1992.

</body>
</html>